The Ca2+/nuclear factor of activated T-cells (Ca2+/NFAT) signaling pathway may play a crucial role in Kawasaki
disease (KD). We investigated 16 genetic variants, selected by bioinformatics analyses or previous studies, in
7 key genes involved in this pathway in a Chinese population. We observed a significantly or marginally
increased KD risk associated with rs2720378 GC + CC genotypes (OR = 1.39, 95% CI = 1.07-1.80, P =
0.014) or rs2069762 AC + CC genotypes (OR = 1.28, 95% CI = 0.98-1.67, P = 0.066), compared with their
wild type counterparts. In classification and regression tree analysis, individuals carrying the combined
genotypes of rs2720378 GC or CC genotype, rs2069762 AC or CC genotype and rs1561876 AA genotype
exhibited the highest KD risk (OR = 2.12, 95% CI = 1.46-3.07, P < 0.001), compared with the lowest risk
carriers of rs2720378 GG genotype. Moreover, a significant dose effect was observed among these three
variants (Ptrend < 0.001). In conclusion, this study suggests that single- and multiple-risk genetic variants
in this pathway might contribute to KD susceptibility. Further studies on more comprehensive single
nucleotide polymorphisms, different ethnicities and larger sample sizes are warranted, and the exact
biological mechanisms need to be further clarified.

Kawasaki disease (KD; OMIM 611775), also called mucocutaneous lymph node syndrome, is an acute, self-limited
vasculitis that predominantly affects children younger than 5 years old. It was first reported in the English
language literature by Tomisaku Kawasaki from Japan in 1974 1. KD occurs worldwide and is most prevalent in
East Asian populations, such as Japanese 2, Koreans 3 and Taiwanese 4. The major clinical manifestations of KD
include prolonged fever, bilateral non-purulent conjunctivitis, diffuse mucosal inflammation, polymorphous skin
rashes, peripheral extremity changes, and cervical lymphadenopathy 5,6. Approximately 15-25% of untreated and
3-5% of treated children develop coronary artery lesions, including coronary artery dilatation or coronary
artery aneurysm, and even myocardial infarction 7,8, making KD the leading cause of acquired heart disease in
childhood in developed countries.

Although almost 40 years have passed since the first description of the disease, its etiology is still not completely
clear. However, the familial aggregation of KD patients suggests that genetic factors contribute to KD
susceptibility and outcome 9,10. Previous association studies based on the candidate gene approach have identified
a set of common variants contributing to KD risk, but few have been consistently replicated. The inositol
1,4,5-triphosphate 3-kinase C (ITPKC) gene may be one of the most studied targets in genetic susceptibility to KD.
A genome-wide linkage analysis conducted in Japanese KD sibling-pair samples led to the identification of a
functional single nucleotide polymorphism (SNP), rs28493229, in the ITPKC gene on chromosome 19q13.2 11,12, which
was subsequently confirmed by a meta-analysis integrating 10 case-control studies and 2 transmission/disequilibrium
tests 13. ITPKC encodes one of the three isoenzymes of inositol 1,4,5-triphosphate 3-kinase, which phosphorylates
inositol 1,4,5-triphosphate, and acts as a negative regulator of the Ca2+/nuclear factor of activated T-cells (NFAT)
signaling pathway in T cells 12. In addition, another SNP in this gene, rs2290692, was found to be associated
with KD susceptibility in a Chinese population 14.

It is believed that T-cell activation plays an important role in the pathogenesis of vascular endothelial cell injury
by eliciting proinflammatory reactions at the onset of KD 15,16, in which the Ca2+/NFAT signaling pathway may
play a crucial role. Once the T-cell receptor receives a stimulus, phospholipase Cγ1 is activated and generates
diacylglycerol and inositol 1,4,5-triphosphate. The latter binds to its receptor on the endoplasmic reticulum (ER)
membrane and causes the release of Ca2+ into the cytoplasm. Depletion of the Ca2+ store in the ER, which is sensed
by stromal interaction molecule (STIM), triggers the entry of extracellular Ca2+ through calcium release-activated
Ca2+ channels on the plasma membrane. Calcineurin is activated by the binding of cytoplasmic Ca2+ to calmodulin,
and then dephosphorylates NFAT in the cytoplasm, leading to nuclear translocation of NFAT. NFAT in the nucleus
drives transcription of genes important in T cell activation, such as IL-2 17.

Table 1 | Association between individual SNP and KD risk

                                           W/H/V frequency             Reference/
Chr  Gene    rs ID       MAF in CHB*  KD cases      Controls      variant allele  OR†    95% CI     LR P‡
4    CASP3   rs2720378   0.326        184/186/56    253/191/49    G/C             1.39§  1.07-1.80  0.014
4    IL-2    rs2069762   0.239        170/193/55    230/210/52    A/C             1.28§  0.98-1.67  0.066
4    STIM2   rs10263     0.367        174/196/53    207/220/65    A/G             0.94¶  0.64-1.39  0.760
11   STIM1   rs1561876   0.344        244/151/23    266/193/32    A/G             0.84§  0.65-1.10  0.204
11   STIM1   rs3750994   0.244        298/112/16    334/137/20    T/G             0.91§  0.69-1.21  0.529
11   STIM1   rs3750996   0.133        249/142/19    287/172/25    A/G             0.94§  0.72-1.23  0.663
18   NFATc1  rs9966033   0.378        158/209/53    184/222/74    T/C             0.79¶  0.54-1.16  0.230
18   NFATc1  rs754093    0.311        176/197/45    222/206/63    T/G             0.82¶  0.55-1.23  0.338
18   NFATc1  rs7227107   0.367        167/201/54    194/224/68    A/G             0.90¶  0.62-1.33  0.600
19   ITPKC   rs28493229  0.070||      359/63/4      422/71/0      G/C             1.11§  0.77-1.59  0.573
19   ITPKC   rs10420685  0.178        230/158/38    267/184/40    A/G             1.10¶  0.69-1.76  0.675
19   ITPKC   rs2290692   0.433        109/190/112   111/238/119   G/C             0.86§  0.64-1.17  0.344
19   ITPKC   rs2561530   0.270||      230/160/33    258/202/28    A/G             1.39¶  0.83-2.34  0.215
19   ITPKC   rs4802085   0.500        129/187/103   129/238/102   G/C             1.17¶  0.86-1.60  0.317
20   NFATc2  rs6013193   0.330        150/210/67    157/251/84    T/G             0.87§  0.66-1.14  0.302

Abbreviations: MAF, minor allele frequency; CHB, Han Chinese in Beijing; Chr, chromosome; W, wild type homozygote;
H, heterozygote; V, variant homozygote; OR, odds ratio; 95%CI, 95% confidence interval; LR, logistic regression
*MAF was downloaded from the online database of HapMap for Han Chinese in Beijing, China
†OR calculation was conducted under the assumption that variant alleles were risk alleles
‡All the P values were adjusted for gender. The positive results were in bold
§The inheritance model of the risk alleles was dominant
¶The inheritance model of the risk alleles was recessive
||MAF was downloaded from the online database of 1000 Genomes for Han Chinese in Beijing, China

Caspase 3 (CASP3) is one of the effector caspases that play an important role in the execution phase of
apoptosis. It has been reported that CASP3 cleaves NFATc2 as one of its substrates in T cells 18, and also
acts as a negative regulator of the Ca2+/NFAT pathway. In addition, a study conducted by Onouchi et al. revealed
that a G to A substitution of one commonly associated SNP located in the 5'-untranslated region of CASP3
(rs113420705, formerly rs72689236) abolished binding of NFAT to the DNA sequence surrounding the SNP 19.

Compared with ITPKC and CASP3, few studies have investigated the relationship between KD susceptibility and
other loci involved in the Ca2+/NFAT signaling pathway, such as STIM1, STIM2, NFATc1, NFATc2 and IL-2, in
spite of their contributions to the immune and inflammatory system 20-25. Therefore, we hypothesized that multiple
common genetic variants involved in the Ca2+/NFAT signaling pathway may individually or interactively contribute
to the etiology and pathogenesis of KD. As for the interactions, traditional analytic strategies such as logistic
regression (LR) may struggle to characterize them fully, since sparseness of the data in high dimensions leads to
large standard errors of parameter estimates and an increase in type I errors 26,27. Moreover, statistical
power decreases and type II errors increase when interactions are detected by LR in a relatively small sample 28.
Thus, a non-parametric data mining approach, classification and regression tree (CART), has been applied to
explore high-order gene-gene and gene-environment interactions, given its statistical advantages in overcoming
inaccurate parameter estimates and providing good power for identifying high-order interactions 29.

Here, we investigated 16 SNPs, which were predicted to be functional or identified by previous studies, in 7
candidate genes involved in the Ca2+/NFAT signaling pathway in a case-control study of a Chinese population. We
examined the individual and combined effects of these 16 variants by a traditional unconditional LR model, and
high-order gene-gene interactions in modulating KD risk using CART analysis.

Results 

Subjects characteristics. A total of 428 KD children and 493 healthy controls were enrolled in this study.
There were 263 (61%) males among cases and 303 (61%) males among controls, and no statistically significant
difference was observed between cases and controls in the distribution of gender (Pearson χ2 = 0.000, P = 0.997).
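The gender comparison can be checked by hand from the counts given above. The following is a minimal sketch, assuming a 2 x 2 Pearson chi-square test without continuity correction and deriving the female counts as total minus males; the function name is illustrative and not part of the original analysis, which was run in SPSS.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Rows: cases, controls. Columns: males, females (females = total - males).
chi2, p = pearson_chi2_2x2(263, 428 - 263, 303, 493 - 303)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

With these counts the statistic rounds to 0.000 and the P value to 0.997, in line with the values reported in the text.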

Association analysis between individual SNP and KD risk. The genotyping call rates of all the 16 SNPs were >95%
except CASP3 rs113420705 (92%), so rs113420705 was not analyzed any further. The genotype distributions of the
remaining 15 SNPs in our control subjects were all in Hardy-Weinberg equilibrium (HWE, P > 0.05) and their
minor allele frequencies (MAFs) were similar to those in the HapMap database of Han Chinese in Beijing, China
(CHB, Supplementary Table S1 online). As shown in Table 1, two SNPs, CASP3 rs2720378 and IL-2 rs2069762, were
significantly or marginally associated with increased risk of KD (odds ratio (OR) = 1.39, 95% confidence
interval (CI) = 1.07-1.80, P = 0.014; OR = 1.28, 95% CI = 0.98-1.67, P = 0.066), both under the dominant model.

Figure 1 | CART analysis of genetic variants in Ca2+/NFAT signaling pathway and KD risk. Cases and controls are
denoted by white and black box respectively, and each node contains frequencies and percentages of cases and
controls in each subgroup.

SCIENTIFIC REPORTS | 4:5208 | DOI: 10.1038/srep05208

Association of high-order interactions with KD risk by CART analysis. In the final optimal decision tree
generated by the CART analysis (Figure 1), the initial split of the root node was CASP3 rs2720378, and
individuals harboring the rs2720378 C allele (GC or CC genotype) had a higher risk of KD than those with the
GG genotype, suggesting that CASP3 rs2720378 was the strongest risk factor for KD among the 15 SNPs examined.
A deeper exploration of the classification tree structure demonstrated distinct interaction patterns between
individuals carrying the rs2720378 GC or CC genotype and those with the rs2720378 GG genotype. Individuals
harboring the GG genotype had the lowest risk for KD, with a case rate of 42%, so we took this terminal node
as the reference. As shown in Table 2, individuals carrying the combination of CASP3 rs2720378 GC or CC
genotype, IL-2 rs2069762 AC or CC genotype and STIM1 rs1561876 AA genotype exhibited the highest risk for KD
(OR = 2.12, 95% CI = 1.46-3.07, P < 0.001).

Table 2 | Risk estimates of CART terminal nodes

Node  Genotype                                      KD cases (%)  Controls (%)  Case rate* (%)  OR†   95% CI     LR P‡
1     rs2720378 (W)                                 185 (43.22)   253 (51.32)   42.2            Ref.  Ref.
3     rs2720378 (HV)-rs2069762 (W)                  91 (21.26)    117 (23.73)   43.8            1.06  0.76-1.48  0.718
5     rs2720378 (HV)-rs2069762 (HV)-rs1561876 (HV)  56 (13.08)    61 (12.37)    47.9            1.26  0.83-1.89  0.276
6     rs2720378 (HV)-rs2069762 (HV)-rs1561876 (W)   96 (22.43)    62 (12.58)    60.8            2.12  1.46-3.07  <0.001

Abbreviations: CART, classification and regression tree; KD, Kawasaki disease; OR, odds ratio; 95%CI, 95%
confidence interval; LR, logistic regression; W, wild type homozygote; H, heterozygote; V, variant homozygote
*Case rate is the percentage of KD patients among all individuals in each node
†ORs of terminal nodes were calculated by LR analysis adjusted for gender
‡All the P values were adjusted for gender. The positive results were in bold
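The odds ratios in Tables 1 and 2 come from gender-adjusted logistic regression, but because gender was balanced between cases and controls, crude estimates from the raw counts land very close to them. The sketch below is an approximation under that assumption, using a Woolf (log-scale) confidence interval; the helper names are illustrative and are not taken from the study.

```python
import math

def odds_ratio_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) 95% confidence interval."""
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    se = math.sqrt(1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

def dominant(w, h, v):
    """Dominant coding: carriers (H + V) are exposed, W is unexposed."""
    return h + v, w

def recessive(w, h, v):
    """Recessive coding: V is exposed, W + H is unexposed."""
    return v, w + h

# CASP3 rs2720378, W/H/V counts from Table 1, dominant model.
case_exp, case_unexp = dominant(184, 186, 56)
ctrl_exp, ctrl_unexp = dominant(253, 191, 49)
or_dom = odds_ratio_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp)

# CART terminal node 6 versus the reference node 1, counts from Table 2.
or_node6 = odds_ratio_ci(96, 185, 62, 253)

print("rs2720378 dominant: OR = %.2f (%.2f-%.2f)" % or_dom)
print("node 6 vs node 1:   OR = %.2f (%.2f-%.2f)" % or_node6)
```

Both crude estimates agree with the adjusted values in the tables to two decimal places; the `recessive` helper can be applied in the same way to the SNPs marked as recessive in Table 1.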

Afterwards, we attempted to test pairwise interactions between the three SNPs (rs2720378, rs2069762 and
rs1561876) by conventional logistic regression analysis. However, we did not find any significant interaction
term, whether multiplicative or additive (Supplementary Table S2 online).

Cumulative effect analysis. A cumulative effect of the 3 SNPs identified in the CART analysis was evaluated by
LR analysis with the rs2720378 C allele, rs2069762 C allele and rs1561876 A allele as risk alleles. Participants
were categorized into 3 groups based on the number of risk alleles they harbored (0-2, 3-4 and 5-6; groups were
pooled because the subgroup with 0 risk alleles was considerably small), and the 0-2 risk alleles subgroup was
regarded as the reference group. The subgroups with 3-4 and 5-6 risk alleles showed marginally significant and
significant associations with increased KD risk, respectively (OR = 1.32, 95% CI = 1.00-1.74, P = 0.052;
OR = 2.93, 95% CI = 1.66-5.17, P < 0.001), and the Cochran-Armitage trend test indicated a significant dose
effect, with the ORs increasing with the number of risk alleles (Ptrend < 0.001, Table 3).
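The grouping and the trend test described above can be sketched as follows. This is an illustration only: the example genotype is hypothetical, the equally spaced scores 0, 1, 2 for the three groups are an assumption (the scores used in the original analysis are not stated), and the group counts are those listed in Table 3.

```python
import math

# Risk alleles as named in the cumulative effect analysis.
RISK_ALLELES = {"rs2720378": "C", "rs2069762": "C", "rs1561876": "A"}

def risk_allele_group(genotypes):
    """Count risk alleles over the three SNPs and map the count to a group."""
    count = sum(genotypes[snp].count(allele) for snp, allele in RISK_ALLELES.items())
    if count <= 2:
        group = "0-2"
    elif count <= 4:
        group = "3-4"
    else:
        group = "5-6"
    return count, group

def cochran_armitage(cases, controls, scores):
    """Cochran-Armitage trend chi-square (1 df) for ordered groups."""
    totals = [r + c for r, c in zip(cases, controls)]
    n, r = sum(totals), sum(cases)
    s_r = sum(s * x for s, x in zip(scores, cases))
    s_n = sum(s * x for s, x in zip(scores, totals))
    s2_n = sum(s * s * x for s, x in zip(scores, totals))
    chi2 = n * (n * s_r - r * s_n) ** 2 / (r * (n - r) * (n * s2_n - s_n ** 2))
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, chi-square with 1 df
    return chi2, p

# Hypothetical individual: one C at rs2720378, two C at rs2069762, two A at rs1561876.
example = {"rs2720378": "GC", "rs2069762": "CC", "rs1561876": "AA"}
count, group = risk_allele_group(example)

# Group counts (0-2, 3-4, 5-6 risk alleles) for cases and controls.
chi2, p_trend = cochran_armitage([144, 222, 41], [216, 253, 21], scores=[0, 1, 2])
print(count, group, round(chi2, 2), p_trend < 0.001)
```

Under these assumptions the trend statistic is roughly 13 on one degree of freedom, which is consistent with the reported Ptrend < 0.001.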

Discussion 

KD is an immune-mediated multi-systemic vasculitis, and the most widely accepted view is that it results from
an unknown infectious trigger in genetically susceptible hosts 19,30. A number of genes have been reported to
be significantly associated with susceptibility to KD in different populations, such as CD40L 31, HLA-E 32,
FAM167A-BLK 33 and FCGR2A 34, through candidate gene approaches or genome-wide association studies. However, most of
these genes were studied in isolation, whichever strategy was applied. The transforming growth factor-beta
signaling pathway, the only pathway studied in KD so far, has been shown to play an important role in KD
pathogenesis, and genetic variants in three genes in the pathway (TGFB2, TGFBR2, SMAD3) influence KD
susceptibility, coronary artery aneurysms, aortic root dilatation and intravenous immunoglobulin treatment
response 35.

To the best of our knowledge, this is probably the first study of single- and multiple-risk genetic variants
in the Ca2+/NFAT signaling pathway and KD risk in a Chinese population. In our study, the candidate SNPs were
predicted to be functional by bioinformatics analyses or identified by previous studies. We first demonstrated
that variants of CASP3 rs2720378 and IL-2 rs2069762 were nominally associated with increased risk of KD. The
subsequent CART analysis revealed the predictive value of gene-gene interactions among CASP3 rs2720378, IL-2
rs2069762 and STIM1 rs1561876 variants for KD risk, and the cumulative analysis further indicated their
synergistic effect on KD risk.

The Ca2+/NFAT signaling pathway plays a crucial role in immune cell functions, including cell proliferation,
cytokine gene expression, differentiation, and cell death 36. Accumulated evidence has indicated that abnormal
expression or anergy caused by mutations of any component in the Ca2+/NFAT signaling pathway could be closely
related to changes in signal intensity or transmission, thus contributing to the pathogenesis or progression of
diseases, especially immune-related diseases 37-42. As far as our current findings are concerned, the positive
result for SNP rs2720378 was also observed in a previous study conducted by Onouchi et al. 19. The investigators
explored the biological effect of CASP3 rs113420705, and found that the G > A substitution weakened the binding
between CASP3 and NFATc2, reduced CASP3 transcription, and presumably would inhibit T-cell apoptosis. In
addition, rs2720378 was in tight linkage disequilibrium (LD) with rs113420705 in Japanese (r2 = 0.88).
Accordingly, we could speculate that rs2720378 might not be the "real" causal variant but only a "proxy".
However, as Onouchi et al. pointed out in the study, there remained a possibility that rs2720378 also affected
CASP3 expression by other, unknown mechanisms. Using the in silico tool SNPinfo 43, rs2720378 was classified as
affecting transcription-factor-binding site (TFBS) activity, with a moderate difference score between the two alleles, which
might be a possible hint. On the other hand, SNP rs2069762, a single base change proximal to the IL-2 promoter
region (-330A > C), was also predicted as affecting TFBS activity, where the C allele had a significantly
higher score than the A allele. Moreover, it has been shown that individuals homozygous for the C allele of
rs2069762 produced over three times the amount of IL-2 than their AA and AC counterparts after stimulation with
anti-CD3/CD28 25, and the SNP has been correlated with a series of inflammatory or immunological diseases, such
as ulcerative colitis 40, multiple sclerosis 41, and childhood lymphoma 42. Therefore, we considered it
biologically plausible that SNP rs2069762 was also associated with increased KD risk. As for STIM1 rs1561876,
few studies have investigated its effect on KD risk, but a recent study has demonstrated that the G allele of
rs1561876 was associated with lower intracellular 1-β-D-arabinofuranosyl-CTP levels, inferior response after
remission induction therapy, greater risk of relapse and poorer overall survival in acute myeloid leukemia
patients receiving cytarabine and cladribine 44. Moreover, this SNP was predicted as affecting TFBS activity as
well as microRNA-binding site activity, which might provide a clue for further study in the future. The
multiple-risk genetic variants detected by CART, CASP3 rs2720378, IL-2 rs2069762 and STIM1 rs1561876, suggested a
stronger combined effect of SNPs, which might prompt researchers to pay more attention to the gene-gene
interactions contributing to KD risk, if they really exist.

Table 3 | Cumulative effect of the 3 SNPs (rs2720378, rs2069762, rs1561876) between KD patients and normal controls

Number of risk alleles  KD cases (%)  Controls (%)  OR    95% CI     LR P*
0-2                     144 (35.38)   216 (44.08)   Ref.  Ref.
3-4                     222 (54.55)   253 (51.63)   1.32  1.00-1.74  0.052
5-6                     41 (10.07)    21 (4.29)     2.93  1.66-5.17  <0.001
Cochran-Armitage Trend Test P < 0.001

Abbreviations: KD, Kawasaki disease; OR, odds ratio; 95%CI, 95% confidence interval; LR, logistic regression
*All the P values were adjusted for gender. The positive results were in bold

Although the association of ITPKC rs28493229 with KD susceptibility has been reported in patients of Japanese
and American ethnicities 12,45, we failed to reproduce such findings in our current study, which was in
accordance with a 2012 study of a Chinese population from Sichuan province 14. The relatively low statistical
power caused by an insufficient number of study subjects might be one explanation. Moreover, the frequency of
the risk-conferring C allele of rs28493229 was significantly lower in CHB (7%) than in Japanese in Tokyo or
Americans (both 15%) according to data obtained from 1000 Genomes (http://www.1000genomes.org/),
which might simply reflect ethnic variation.

Considering the vital importance of the Ca2+/NFAT pathway in the pathogenesis of KD, exploring novel inhibitors
of this pathway seems promising for the treatment of KD. Recently, two clinical trials of cyclosporine A,
which potently suppresses the activity of T cells as a calcineurin inhibitor, have indicated that cyclosporine A
appeared to be a safe and effective approach for patients with refractory KD 46,47.

Certain limitations to this study should be noted. Firstly, the candidate genes we studied were the primary
but not all the genes involved in the Ca2+/NFAT pathway, and the screening strategy of candidate SNPs was not so
rigorous that the SNPs we selected could cover all the probably functional SNPs among these gene regions.
Thus, it might be unable to fully reflect the genetic effects of this pathway on KD susceptibility and might
reduce the strength of our conclusions. Secondly, these findings were preliminary since the sample size was
relatively small and they were not replicated in this current study. Thirdly, the lack of information about
environmental factors, such as family history and infection history, which might play roles in KD onset, limited
our further investigation of gene-environment interactions. Fourthly, the significant associations we
observed were merely on the statistical level, and we did not evaluate their real biological functions.

In summary, our study suggests the importance of the single- and multiple-risk variants in the Ca2+/NFAT
signaling pathway in KD susceptibility, and provides a clue for detecting the potential gene-gene interactions
conferring KD risk. Further studies on more comprehensive SNPs, different ethnicities and larger sample sizes
are warranted, and follow-up studies are also needed to identify the true causal variants and to reveal the
biological mechanism of genetic etiology.

Methods 

Study subjects. This study included 428 children diagnosed with KD, all of whom were unrelated ethnic Han
Chinese and were consecutively recruited between April 2009 and September 2012 from Children's Hospital,
Zhejiang University School of Medicine, China. The diagnosis of KD was based on the 5th revised edition of the
guidelines established by the Kawasaki Disease Research Committee in Japan in 2002
(http://kawasaki-disease.org/diagnostic/). The controls comprised 493 ethnically and
gender-matched unrelated healthy Chinese children without any evidence of
infection. They were recruited from the same hospital at the time of a routine physical
examination.

This study was approved by the ethics committees of the Children's Hospital, 
Zhejiang University School of Medicine, and the methods were carried out in 
accordance with the approved guidelines. Informed consent was obtained from all 
participants or parents/caregivers of all the subjects who were studied. 

Identification of candidate SNPs and genotyping. The candidate genes involved in the Ca2+/NFAT pathway were
selected based on recent findings on genetic variants of KD, previous association studies and biological
evidence, and comprised 7 genes (ITPKC, CASP3, STIM1, STIM2, NFATc1, NFATc2 and IL-2). The screening procedure
for SNPs predicted to be functional in these genes was as follows. First, we extracted the ranges of the
physical positions covering the 7 genes and 2 kb up- and downstream of each from the HapMap database
(http://hapmap.ncbi.nlm.nih.gov/, HapMap Data Rel 24/phaseII Nov08, on NCBI B36 assembly, dbSNP b126), CHB.
Second, we entered the position of each gene into an integrated bioinformatics tool, "SNPinfo" 43
(http://snpinfo.niehs.nih.gov/snpinfo/snpfunc.htm), and retrieved a set of SNPs with possible functions. Third,
we filtered these SNPs by MAF in CHB > 10%, and obtained a total of 52 SNPs in the 7 candidate genes. The
detailed information for all these SNPs is listed in Supplementary Table S3 online, including the possible
functions of allelic difference that the SNPinfo tool can predict (marked with a letter Y), such as TFBS,
splicing site, microRNA-binding site, nonsynonymous SNP and polymorphism phenotyping. Fourth, we tested the LD
among these SNPs by Haploview V4.2 48 (Supplementary Fig. S1 online); SNPs in strong LD with each other
(r2 > 0.80) were considered redundant and only one was retained. As a result, a total of 32 SNPs remained for
further selection (Supplementary Table S4 online). Since a few of these SNPs, such as ITPKC rs7251246 and STIM1
rs12806698, were not present in the results of Haploview, we rechecked the LD of these 32 SNPs using the online
database of 1000 Genomes (http://www.1000genomes.org/), and found that several pairs of SNPs were in strong LD
with each other (Supplementary Table S5 online). Therefore, 27 SNPs were obtained by applying the same
filtering procedure (Supplementary Table S6 online). Next, we gave priority to SNPs predicted to harbor multiple
functional motifs if the same gene also contained SNPs with a single predicted functional effect (Supplementary
Table S7 online). Afterwards, among the genes that harbored more than one probably functional SNP, we deleted
five SNPs (rs9962479, rs4811191, rs2581732, rs9518 and rs4647601).
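The LD filtering step above (keep only one SNP from any pair with r2 > 0.80) can be illustrated with a small sketch. It assumes phased haplotype counts, whereas Haploview estimates haplotype frequencies from unphased genotypes; the counts, the SNP labels and the greedy pruning order below are hypothetical and are not taken from the study.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic SNPs from phased haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = p_AB - p_A * p_B  # linkage disequilibrium coefficient D
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))

def prune_by_ld(snps, pairwise_r2, threshold=0.80):
    """Greedy pruning: keep a SNP only if it is not in strong LD with one already kept."""
    kept = []
    for snp in snps:
        if all(pairwise_r2.get(frozenset((snp, k)), 0.0) <= threshold for k in kept):
            kept.append(snp)
    return kept

# Hypothetical haplotype counts for two SNP pairs.
r2_strong = ld_r2(60, 2, 1, 37)    # nearly always inherited together
r2_weak = ld_r2(30, 25, 20, 25)    # close to independent

pairwise = {
    frozenset(("snp1", "snp2")): r2_strong,
    frozenset(("snp1", "snp3")): r2_weak,
    frozenset(("snp2", "snp3")): r2_weak,
}
kept = prune_by_ld(["snp1", "snp2", "snp3"], pairwise)
print(round(r2_strong, 3), round(r2_weak, 3), kept)
```

In this toy example snp2 is dropped because it is redundant with snp1, while snp3 is retained. Which member of a redundant pair to keep is a design choice; here it is simply the first one encountered.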

Moreover, we added CASP3 rs113420705 and ITPKC rs28493229, which have been previously identified as
biologically functional and associated with KD risk 12,19. Thus, a total of 16 SNPs were collected for this
study (Supplementary Table S1 online).

The allelic differences in effect on DNA motifs that SNPinfo could predict are shown in Supplementary Table S8
online. For different possible functions, the prediction scores were calculated by different methods, such as
Match 49, miRanda 50, ESEfind 51 and PolyPhen 52, and if there were several pairs of scores for the two alleles
of a given SNP, we chose the pair with the maximal score difference between the two alleles. The only exception
was rs2290692, which had the same score for both alleles. Since it was found to be associated with KD
susceptibility in a Chinese population in a previous study 14, we retained it for further study.

Genomic DNA was extracted from a 2 mL peripheral blood sample collected from each participant at recruitment,
using the RelaxGene Blood System DP319-02 (Tiangen, Beijing, China) according to the manufacturer's instructions.
The concentration and the optical density of DNA were confirmed by NanoDrop
1000 spectrophotometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA).
All the SNPs were genotyped using the TaqMan OpenArray Genotyping Assay System
(Applied Biosystems, Foster City, CA, USA), the main procedure of which was as follows. The final reaction
volume per well was 4 μL for a 384-well plate, including TaqMan OpenArray Master Mix, 2× (2 μL), which was
customized by Applied Biosystems, and normalized DNA sample (2 μL). We applied the
OpenArray AccuFill System (Applied Biosystems, Foster City, CA, USA) to load the
samples into OpenArray plates (32 × 96 format) automatically, then placed the loaded plates into cycling cases
and UV cured them for 90 seconds. Next, we loaded the
plates into the thermal cycler with an initial temperature of 93°C for 10 minutes,
followed by 50 cycles of 95°C for 45 seconds, 94°C for 13 seconds and 53°C for 134
seconds; after that, the temperature dropped to 25°C for 2 minutes and was
maintained at 4°C. Finally, allelic discrimination plate reading and analysis were performed using the TaqMan
OpenArray System, following the manufacturer's protocol.

Notably, genotyping was performed blind to the case-control status of the participants.
A masked, random 5% of the samples were genotyped twice for quality control, and the
reproducibility was 100%.

Statistical methods. SNPs with genotyping call rates <95% or those that deviated
from the HWE in controls (P < 0.05) were excluded. The HWE for genotypes was
assessed by a goodness-of-fit χ2 test in the control group. The distribution differences
in gender and genotype frequencies between cases and controls were examined by
Pearson's χ2 test or Fisher's exact test, when appropriate. Unconditional LR analysis
was then applied to estimate ORs and their 95% CIs for the effects of the SNPs on KD
susceptibility, under the assumption that variant alleles were the risk alleles. In order to
increase the statistical power, the most likely inheritance model for each SNP was
selected instead of calculating three models simultaneously. The statistical power to
detect the effects of the SNPs was calculated by Power v3.0.0 53,54. For example, for
SNPs with minor allele frequency of 0.070, 0.244 and 0.500, we calculated that the
power for our sample size to detect an odds ratio of 1.50 was 0.40, 0.79 and 0.86,
respectively.
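The HWE goodness-of-fit test mentioned above can be reproduced from the control genotype counts in Table 1. The following is a minimal sketch of the standard one-degree-of-freedom χ2 test, shown for the CASP3 rs2720378 controls (W/H/V = 253/191/49); the function name is illustrative, and the software used by the authors may apply a slightly different variant of the test.

```python
import math

def hwe_chi2(n_w, n_h, n_v):
    """Goodness-of-fit chi-square for Hardy-Weinberg equilibrium (1 df)."""
    n = n_w + n_h + n_v
    p = (2 * n_w + n_h) / (2 * n)  # frequency of the wild-type allele
    q = 1.0 - p                    # frequency of the variant allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_w, n_h, n_v)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, chi-square with 1 df
    return chi2, p_value

# Control genotype counts for CASP3 rs2720378 from Table 1.
chi2, p_value = hwe_chi2(253, 191, 49)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
```

For this SNP the P value is above 0.05, consistent with the statement that the control genotype distributions were in HWE.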

The potential gene-gene interactions were examined by CART, which is a non- 
parametric technique and does not require assumptions about the distribution of the 
data. A CART tree is constructed by splitting data recursively into binary subsamples, 
beginning with the root node that contains all the learning sample, and leading to the 
formation of daughter nodes (nodes that can be split further) and terminal nodes 
(nodes that cannot be split any further). Gini criteria are used to achieve a high degree 
of homogeneity in the terminal nodes or subgroups. After a large tree is grown, which 
tends to be complex and difficult to interpret, a pruning procedure is performed to 
avoid overfitting the model. In the final stage, the optimal tree is selected based on the 
lowest misclassification error rate, which can be assessed based on cross validation 55 . 
Subgroups of individuals with differential risk associations with KD were identified in 
the different terminal nodes of the tree, indicating potential interactions. Finally, the
risk of these subgroups was evaluated by LR analysis, with the subgroup having the lowest percentage of cases
as the reference, and ORs and 95% CIs were adjusted for gender 28. After CART analysis,
the combined risk of the variants that potentially interacted with each other was evaluated by a
cumulative effect analysis.
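To make the Gini splitting criterion concrete, the sketch below computes the impurity decrease for the root split on rs2720378, using the terminal-node counts from Table 2 (nodes 3, 5 and 6 pooled as C-allele carriers). It illustrates the criterion only and does not reproduce the SPSS tree-growing, pruning or cross-validation procedure; the function names are illustrative.

```python
def gini(cases, controls):
    """Gini impurity of a node containing two classes."""
    n = cases + controls
    p = cases / n
    return 1.0 - p * p - (1.0 - p) ** 2

def gini_decrease(left, right):
    """Impurity decrease when a parent node is split into two children.

    Each child is given as a (cases, controls) tuple.
    """
    parent = (left[0] + right[0], left[1] + right[1])
    n = sum(parent)
    w_left, w_right = sum(left) / n, sum(right) / n
    return gini(*parent) - w_left * gini(*left) - w_right * gini(*right)

# (cases, controls) per child of the root split on rs2720378.
gg = (185, 253)                             # node 1: GG genotype
carriers = (91 + 56 + 96, 117 + 61 + 62)    # nodes 3, 5, 6: GC or CC genotype

decrease = gini_decrease(gg, carriers)
print(carriers, round(decrease, 5))
```

A CART algorithm evaluates this quantity for every candidate split and chooses the one with the largest decrease, so a positive value for the rs2720378 split is what one would expect for the variable selected at the root.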

All of the above statistical analyses were conducted with SPSS software v13.0 (SPSS,
Inc., Chicago, IL), and all P values were two-tailed with a statistical significance level of
0.05.